Multi-Objective Genetic Programming Projection Pursuit for Exploratory Data Modeling

Authors

  • Ilknur Icke
  • Andrew Rosenberg
Abstract

For classification problems, feature extraction is a crucial process that aims to find a data representation that improves the performance of the machine learning algorithm. According to the curse of dimensionality [4], the number of samples needed for a classification task increases exponentially with the number of dimensions (variables, features). At the same time, data are costly to collect, store and process. Moreover, irrelevant and redundant features can hinder classifier performance. In exploratory analysis settings, high dimensionality prevents users from exploring the data visually. Feature extraction is a two-step process: feature construction and feature selection. Feature construction creates new features based on the original features, and feature selection is the process of selecting the best features, as in filter, wrapper and embedded methods [5]. In this work, we focus on feature construction methods that aim to decrease data dimensionality for visualization tasks. Various linear techniques (such as principal component analysis (PCA), multiple discriminant analysis (MDA) and exploratory projection pursuit) and non-linear techniques (such as multidimensional scaling (MDS), manifold learning, kernel PCA/LDA and evolutionary constructive induction) have been proposed for dimensionality reduction. Our algorithm is an adaptive feature extraction method that consists of evolutionary constructive induction for feature construction and a hybrid filter/wrapper method for feature selection.
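
The two-step process described in the abstract (construct new features, then select the best ones) can be illustrated with a minimal sketch. This is not the authors' algorithm: pairwise products stand in for evolutionary constructive induction, and a two-class Fisher score stands in for the filter part of the selection step. The function names `construct_features`, `fisher_score` and `select_features`, and the toy data, are hypothetical and chosen only for illustration.

```python
import itertools
import statistics


def construct_features(rows):
    """Feature construction (illustrative stand-in): append pairwise products."""
    n = len(rows[0])
    pairs = list(itertools.combinations(range(n), 2))
    return [list(r) + [r[i] * r[j] for i, j in pairs] for r in rows]


def fisher_score(values, labels):
    """Filter criterion for two classes: squared mean gap over summed variances."""
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    gap = (statistics.mean(a) - statistics.mean(b)) ** 2
    spread = statistics.pvariance(a) + statistics.pvariance(b)
    return gap / (spread + 1e-12)  # small constant avoids division by zero


def select_features(rows, labels, k=2):
    """Feature selection: return the indices of the k highest-scoring columns."""
    columns = list(zip(*rows))
    ranked = sorted(
        range(len(columns)),
        key=lambda c: fisher_score(columns[c], labels),
        reverse=True,
    )
    return ranked[:k]


# XOR-like toy data: neither original feature separates the classes,
# but their product does.
rows = [(1, 1), (-1, -1), (1, -1), (-1, 1), (2, 2), (-2, -2), (2, -2), (-2, 2)]
labels = [0, 0, 1, 1, 0, 0, 1, 1]

expanded = construct_features(rows)
best = select_features(expanded, labels, k=1)
print(best)  # index 2 is the constructed product feature
```

On this toy data both original features score zero under the filter criterion, while the constructed product feature separates the classes, which is the kind of gain feature construction is meant to provide before selection.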

Similar resources

Projection Pursuit for Exploratory Supervised Classification

In high-dimensional data, one often seeks a few interesting low-dimensional projections that reveal important features of the data. Projection pursuit is a procedure for searching high-dimensional data for interesting low-dimensional projections via the optimization of a criterion function called the projection pursuit index. Very few projection pursuit indices incorporate class or group inform...

Model and Solution Approach for Multi objective-multi commodity Capacitated Arc Routing Problem with Fuzzy Demand

The capacitated arc routing problem (CARP) is one of the most important routing problems, with many applications in real-world situations. In some real applications, such as urban waste collection, decision makers have to consider more than one objective and investigate the problem under uncertain conditions where required edges have demand for more than one type of commodity. So, in thi...

Performing a Preprocessing Step before the Feature Extraction Step in the Classification of Hyperspectral Image Data

Hyperspectral data potentially contain more information than multispectral data because of their higher spectral resolution. However, the stochastic data analysis approaches that have been successfully applied to multispectral data are not as effective for hyperspectral data. Various investigations indicate that the key problem that causes poor performance in the stochastic approaches t...

A Multi Objective Genetic Algorithm (MOGA) for Optimizing Thermal and Electrical Distribution in Tumor Ablation by Irreversible Electroporation

Background: Irreversible electroporation (IRE) is a novel tumor ablation technique. IRE is associated with high electrical fields and is often reported in conjunction with thermal damage caused by Joule heating. For a good response to surgery, it is crucial to produce minimum thermal damage in both tumoral and healthy tissues, a regime named non-thermal irreversible electroporation (NTIRE). Non-thermal irrev...

Combining Exploratory Projection Pursuit and Projection Pursuit Regression with Application to Neural Networks

Parameter estimation becomes difficult in high-dimensional spaces due to the increasing sparseness of the data. Therefore, when a low-dimensional representation is embedded in the data, dimensionality reduction methods become useful. One such method, projection pursuit regression (PPR; Friedman and Stuetzle 1981), is capable of performing dimensionality reduction by composition, namely, it constr...

Journal:
  • CoRR

Volume: abs/1010.1888

Publication date: 2010